386 research outputs found
Superpixel-based Semantic Segmentation Trained by Statistical Process Control
Semantic segmentation, like other fields of computer vision, has seen a
remarkable performance advance through the use of deep convolutional neural networks.
However, considering that neighboring pixels are heavily dependent on each
other, both training and testing of these methods involve many redundant
operations. To resolve this problem, the proposed network is trained and tested
with only 0.37% of the total pixels, selected by superpixel-based sampling, which
largely reduces the complexity of the upsampling calculation. The hypercolumn
feature maps are constructed by a pyramid module in combination with the
convolution layers of
the base network. Since the proposed method uses a very small number of sampled
pixels, the end-to-end learning of the entire network is difficult with a
common learning rate for all the layers. In order to resolve this problem, the
learning rate after sampling is controlled by statistical process control (SPC)
of gradients in each layer. The proposed method performs as well as or better
than conventional methods that use many more samples on the Pascal Context and
SUN-RGBD datasets.
Comment: Accepted in British Machine Vision Conference (BMVC), 201
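The abstract above does not say which control chart or adjustment rule is used, so the following is only a minimal sketch of the general idea. It assumes a Shewhart-style chart (mean plus or minus three standard deviations) on one layer's gradient norms. The function names, the baseline values, and the rule of halving the learning rate when a gradient norm falls outside the limits are all hypothetical and are not taken from the paper.

```python
import statistics

def control_limits(baseline, k=3.0):
    """Return (lower, upper) control limits: mean -/+ k * std of the baseline values."""
    mean = statistics.fmean(baseline)
    std = statistics.pstdev(baseline)
    return mean - k * std, mean + k * std

def adjust_learning_rate(lr, grad_norm, limits, shrink=0.5):
    """Hypothetical rule: shrink the layer's learning rate when the
    gradient norm falls outside the control limits, otherwise keep it."""
    lower, upper = limits
    if grad_norm < lower or grad_norm > upper:
        return lr * shrink
    return lr

# Illustrative (made-up) gradient norms collected for one layer.
baseline_norms = [1.0, 1.2, 0.8, 1.1, 0.9]
limits = control_limits(baseline_norms)

lr_in = adjust_learning_rate(0.01, 1.05, limits)   # norm within limits
lr_out = adjust_learning_rate(0.01, 5.0, limits)   # norm outside limits
```

In this sketch each layer would keep its own baseline and limits, so layers with unstable gradients are slowed down independently of the others.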
Frontal top-down signals increase coupling of auditory low-frequency oscillations to continuous speech in human listeners
Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1 and 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3 and 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception
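Transfer entropy, the directed connectivity measure named above, quantifies how much the past of a source signal improves prediction of a target signal beyond the target's own past. The following is a minimal sketch for short discrete sequences with a history length of one sample, using a plain plug-in (counting) estimator. The MEG analysis works with continuous phase data and presumably uses a different estimator and delay handling; the function name and the toy sequences are illustrative assumptions only.

```python
import math
from collections import Counter

def transfer_entropy(source, target):
    """Plug-in transfer entropy source -> target in bits.

    Assumes discrete symbols, equal-length sequences, and history length 1.
    """
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_yx = Counter(zip(target[:-1], source[:-1]))
    pairs_yy = Counter(zip(target[1:], target[:-1]))
    singles = Counter(target[:-1])

    te = 0.0
    for (y_next, y_prev, x_prev), count in triples.items():
        p_joint = count / n
        # p(y_next | y_prev, x_prev): prediction using both histories.
        p_cond_full = count / pairs_yx[(y_prev, x_prev)]
        # p(y_next | y_prev): prediction using the target's history only.
        p_cond_self = pairs_yy[(y_next, y_prev)] / singles[y_prev]
        te += p_joint * math.log2(p_cond_full / p_cond_self)
    return te

# Toy data: the target copies the source with a one-sample delay.
source = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
target = [0] + source[:-1]

te_driven = transfer_entropy(source, target)
te_constant = transfer_entropy([0] * len(target), target)
```

A constant source carries no information about the target, so the estimate for it is zero, while the delayed-copy case yields a positive value.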
Predictive entrainment of natural speech through two fronto-motor top-down channels
Natural communication between interlocutors is enabled by the ability to predict upcoming speech in a given context. Previously we showed that these predictions rely on a fronto-motor top-down control of low-frequency oscillations in auditory-temporal brain areas that track intelligible speech. However, a comprehensive spatio-temporal characterisation of this effect is still missing. Here, we applied transfer entropy to source-localised MEG data during continuous speech perception. First, at low frequencies (1–4 Hz, brain delta phase to speech delta phase), predictive effects start in left fronto-motor regions and progress to right temporal regions. Second, at higher frequencies (14–18 Hz, brain beta power to speech delta phase), predictive patterns show a transition from left inferior frontal gyrus via left precentral gyrus to left primary auditory areas. Our results suggest a progression of prediction processes from higher-order to early sensory areas in at least two different frequency channels
Validation of Yoon's Critical Thinking Disposition Instrument
Summary. Purpose: The lack of reliable and valid evaluation tools targeting Korean nursing students' critical thinking (CT) abilities has been reported as one of the barriers to instructing and evaluating students in undergraduate programs. Yoon's Critical Thinking Disposition (YCTD) instrument was developed for Korean nursing students, but few studies have assessed its validity. This study aimed to validate the YCTD. Specifically, the YCTD was assessed to identify its cross-sectional and longitudinal measurement invariance. Methods: This was a validation study in which a cross-sectional and longitudinal (prenursing and postnursing practicum) survey was used to validate the YCTD using 345 nursing students at three universities in Seoul, Korea. The participants' CT abilities were assessed using the YCTD before and after completing an established pediatric nursing practicum. The validity of the YCTD was estimated, and then a group invariance test using multigroup confirmatory factor analysis was performed to confirm the measurement compatibility across groups. Results: A test of the seven-factor model showed that the YCTD demonstrated good construct validity. Multigroup confirmatory factor analysis findings for the measurement invariance suggested that this model structure demonstrated strong invariance between groups (i.e., configural, factor loading, and intercept combined) but weak invariance within a group (i.e., configural and factor loading combined). Conclusions: In general, traditional methods for assessing instrument validity have been less than thorough. In this study, multigroup confirmatory factor analysis using cross-sectional and longitudinal measurement data allowed validation of the YCTD. This study concluded that the YCTD can be used for evaluating Korean nursing students' CT abilities
Test-time Adaptation vs. Training-time Generalization: A Case Study in Human Instance Segmentation using Keypoints Estimation
We consider the problem of improving the human instance segmentation mask
quality for a given test image using keypoints estimation. We compare two
alternative approaches. The first approach is a test-time adaptation (TTA)
method, where we allow test-time modification of the segmentation network's
weights using a single unlabeled test image. In this approach, we do not assume
test-time access to the labeled source dataset. More specifically, our TTA
method consists of using the keypoints estimates as pseudo labels and
backpropagating them to adjust the backbone weights. The second approach is a
training-time generalization (TTG) method, where we permit offline access to
the labeled source dataset but not the test-time modification of weights.
Furthermore, we do not assume the availability of any images from or knowledge
about the target domain. Our TTG method consists of augmenting the backbone
features with those generated by the keypoints head and feeding the aggregate
vector to the mask head. Through a comprehensive set of ablations, we evaluate
both approaches and identify several factors limiting the TTA gains. In
particular, we show that in the absence of a significant domain shift, TTA may
hurt and TTG shows only a small gain in performance, whereas for a large domain
shift, TTA gains are smaller and dependent on the heuristics used, while TTG
gains are larger and robust to architectural choices
MEGAN: Mixture of Experts of Generative Adversarial Networks for Multimodal Image Generation
Recently, generative adversarial networks (GANs) have shown promising
performance in generating realistic images. However, they often struggle in
learning complex underlying modalities in a given dataset, resulting in
poor-quality generated images. To mitigate this problem, we present a novel
approach called mixture of experts GAN (MEGAN), an ensemble approach of
multiple generator networks. Each generator network in MEGAN specializes in
generating images with a particular subset of modalities, e.g., an image class.
Instead of incorporating a separate step of handcrafted clustering of multiple
modalities, our proposed model is trained through end-to-end learning of
multiple generators via gating networks, which are responsible for choosing the
appropriate generator network for a given condition. We adopt the categorical
reparameterization trick for a categorical decision to be made in selecting a
generator while maintaining the flow of the gradients. We demonstrate that
individual generators learn different and salient subparts of the data and
achieve a multiscale structural similarity (MS-SSIM) score of 0.2470 for CelebA
and a competitive unsupervised inception score of 8.33 on CIFAR-10.
Comment: 27th International Joint Conference on Artificial Intelligence (IJCAI 2018)
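The categorical reparameterization trick mentioned in the MEGAN abstract is commonly realized as the Gumbel-softmax relaxation. Assuming that is what is meant, the sketch below draws a relaxed one-hot vector over generators from gating logits. It uses only the Python standard library and computes no gradients; the function name, the logits, and the temperature are illustrative assumptions rather than details from the paper.

```python
import math
import random

def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw a relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    eps = 1e-12
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are defined.
        u = min(max(rng.random(), eps), 1.0 - eps)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical gating logits for three generators.
rng = random.Random(0)
weights = gumbel_softmax([2.0, 0.5, 0.1], temperature=0.5, rng=rng)
chosen = max(range(len(weights)), key=weights.__getitem__)
```

Lower temperatures push the weights toward a hard one-hot choice, while the softmax keeps the sample differentiable with respect to the logits in a framework that tracks gradients.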
Gating of memory encoding of time-delayed cross-frequency MEG networks revealed by graph filtration based on persistent homology
To explain gating of memory encoding, magnetoencephalography (MEG) was analyzed over a multi-regional network of negative correlations between alpha band power during cue (cue-alpha) and gamma band power during item presentation (item-gamma) in Remember (R) and No-remember (NR) conditions. Persistent homology with graph filtration on the alpha-gamma correlation disclosed topological invariants to explain memory gating. Instruction compliance (R-hits minus NR-hits) was significantly related to negative coupling between the left superior occipital (cue-alpha) and the left dorsolateral superior frontal gyri (item-gamma) on a permutation test, where the coupling was stronger in R than in NR. In good memory performers (R-hits minus false alarm), the coupling was stronger in R than in NR between the right posterior cingulate (cue-alpha) and the left fusiform gyri (item-gamma). Gating of memory encoding was dictated by inter-regional negative alpha-gamma coupling. Our graph filtration over the MEG network revealed that this inter-regional time-delayed cross-frequency connectivity serves gating of memory encoding
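One common form of graph filtration tracks how connected components merge as an edge-weight threshold increases, which yields the zero-dimensional part of persistent homology. The sketch below illustrates that idea with a union-find structure. It assumes the correlations have already been converted to distance-like edge weights, and the function name and toy edges are hypothetical. It is not the authors' pipeline.

```python
def component_merge_thresholds(n_nodes, edges):
    """Return the edge weights at which connected components merge
    as the threshold grows.

    edges: iterable of (weight, u, v) with distance-like weights.
    """
    parent = list(range(n_nodes))

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merges = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            merges.append(weight)
    return merges

# Toy network with four nodes and made-up weights.
edges = [(0.2, 0, 1), (0.9, 1, 2), (0.5, 0, 2), (0.7, 2, 3)]
merges = component_merge_thresholds(4, edges)

def components_at(threshold):
    """Number of connected components when only edges up to threshold are kept."""
    return 4 - sum(1 for w in merges if w <= threshold)
```

The list of merge thresholds is the set of death values of the zero-dimensional barcode, so two conditions can be compared by how quickly their networks become connected.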